Systems and methods for instrumenting loops of an executable program

ABSTRACT

Systems and methods for instrumenting a loop of an executable program are disclosed. One embodiment relates to a method of inserting instrumentation code into an executable program. The method may comprise inserting a register adder initialization instruction prior to a loop entry point of a loop in an executable program such that paths reaching the loop entry point also reach the register adder initialization instruction, inserting a register add instruction between the loop entry point and a back edge of the loop, and inserting a loop counter update instruction after the back edge of the loop.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is related to the following commonly assigned co-pending patent application entitled: “SYSTEMS AND METHODS FOR BRANCH PROFILING LOOPS OF AN EXECUTABLE PROGRAM,” Attorney Docket No. 200313027-1, which is filed contemporaneously herewith and is incorporated herein by reference.

BACKGROUND

Code instrumentation is a method for analyzing and evaluating program code performance. Source instrumentation modifies a program's original source code, while binary instrumentation modifies an existing binary executable. In one approach to binary code instrumentation, new instructions or probe code are added to an executable program, and consequently, the original code in the program is changed and/or relocated. Some examples of probe code include adding values to a register, moving the address of some data to some registers, and adding counters to determine how many times a function is called. The changed and/or relocated code is referred to as instrumented code, or more generally, as an instrumented process.

One specific type of code instrumentation is referred to as dynamic binary instrumentation. Dynamic binary instrumentation allows program instructions to be changed on-the-fly. Measurements such as basic-block coverage and function invocation counting can be accurately determined using dynamic binary instrumentation. Additionally, dynamic binary instrumentation, in contrast to static instrumentation, is performed at run-time of a program and only instruments those parts of an executable that are actually executed. This minimizes the overhead imposed by the instrumentation process itself. Furthermore, performance analysis tools based on dynamic binary instrumentation require no special preparation of an executable such as, for example, a modified build or link process.

SUMMARY

One embodiment of the present invention may comprise a system for instrumenting loops of an executable program. The system may comprise a dynamic instrumentation tool that inserts a register add instruction associated with a back edge of a loop in an executable program and a loop counter update instruction associated with an exit point of the loop. The register add instruction may increment a register value with executed iterations of the loop for a given loop execution, and the loop counter update instruction may update a loop counter value based on the register value at completion of the given loop execution. The system may have a shared memory that retains the loop counter value associated with a total number of loop iterations of the loop.

Another embodiment may comprise a method of inserting instrumentation code into a loop of an executable program. The method may comprise inserting a register adder initialization instruction prior to a loop entry point of a loop in an executable program such that paths reaching the loop entry point also reach the register adder initialization instruction, inserting a register add instruction between the loop entry point and a back edge of the loop, and inserting a loop counter update instruction after the back edge of the loop.

Yet another embodiment of the present invention may relate to a computer readable medium having computer executable instructions for performing a method. The method may comprise performing loop analysis on an executable program to identify at least one loop, assigning a register add instruction to a back edge of the at least one loop, and assigning a loop counter update instruction to an exit point associated with the at least one loop.

Still another embodiment may relate to a dynamic instrumentation system. The dynamic instrumentation system may comprise means for generating an intermediate representation of a function associated with an executable program, means for analyzing the intermediate representation to identify at least one loop in the function, and means for inserting code into the identified at least one loop. The means for inserting code may insert a register add instruction between a loop entry point and a back edge of the identified at least one loop, and a loop counter update instruction after the back edge of the identified at least one loop. The dynamic instrumentation system may comprise means for encoding the inserted code and the intermediate representation of the function to produce an instrumented function.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an embodiment of a dynamic instrumentation system.

FIG. 2 illustrates an embodiment of components associated with a dynamic instrumentation tool.

FIG. 3 illustrates an embodiment of a block diagram of contents of a portion of shared memory.

FIG. 4 illustrates an embodiment of a loop associated with an executable program having instrumentation counters inserted therein.

FIG. 5 illustrates a methodology for inserting instrumentation code into loops of an executable program.

FIG. 6 illustrates an embodiment of an alternate methodology for inserting instrumentation code into loops of an executable program.

FIG. 7 illustrates an embodiment of yet another alternate methodology for inserting instrumentation code into loops of an executable program.

FIG. 8 illustrates an embodiment of a computer system.

DETAILED DESCRIPTION

This disclosure relates generally to dynamic instrumentation systems and methods. A loop analysis is performed on an executable program to identify loops associated with the executable program. A register add instruction is inserted at a back edge of a loop, and a loop counter update instruction is inserted at an exit point associated with the loop. A back edge of the loop is a branch from the bottom of the loop to an entry point of the loop that builds the loop cycle. A register add instruction increments a register value based on loop iterations associated with a loop execution. The loop counter update instruction updates a loop counter that maintains a count of loop iterations over a plurality of loop executions. The loop counter update instruction can include one or more instructions to update a loop counter (e.g., stored in memory). The number of instructions for updating the loop counter is based on the particular processor architecture being employed.

During program execution, the register add instruction increments a register value with executed iterations of an executed loop. The register add instruction can employ a free register of the system (e.g., processor architecture). A free register is a register that can be safely modified without modifying the program semantics of the executable program. The employment of a free register provides for multi-thread safe operation of the instrumentation counter. Additionally, register add instructions are substantially faster and shorter (less code size) than instructions to increment a counter in memory. Thus, employing register add instructions instead of loop counter memory update instructions for counting loop iterations provides for improved execution speeds associated with an instrumented executable program.
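The saving described above can be illustrated with a small simulation. This is only a sketch in Python rather than machine code: the function names and the `memory_updates` tally are illustrative assumptions, and a local variable stands in for the free register.

```python
def count_with_memory_counter(trip_counts):
    """Baseline: the loop counter in memory is updated on every iteration."""
    loop_counter = 0     # stands in for the loop counter value in memory
    memory_updates = 0   # how many times that memory location is written
    for trips in trip_counts:          # one entry per loop execution
        for _ in range(trips):
            loop_counter += 1
            memory_updates += 1
    return loop_counter, memory_updates


def count_with_register_adder(trip_counts):
    """Register adder: memory is written once per loop execution, at loop exit."""
    loop_counter = 0
    memory_updates = 0
    for trips in trip_counts:
        rx = 0                         # register adder initialization
        for _ in range(trips):
            rx += 1                    # register add instruction
        loop_counter += rx             # loop counter update instruction
        memory_updates += 1
    return loop_counter, memory_updates
```

For three loop executions of 3, 5 and 2 iterations, both variants arrive at the same total, but the register adder variant writes the memory counter three times instead of ten.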

The loop counter update instruction can be embedded in a multi-thread safe set of ownership instructions, such as a spinlock operation. A spinlock operation provides a thread with ownership of the loop counter value stored in memory, preventing other threads from incrementing the loop counter value until the ownership is released.

FIG. 1 illustrates a dynamic instrumentation system 10. The dynamic instrumentation system 10 can be a computer, a server or some other computer medium that can execute computer readable instructions. For example, the components of the system 10 can be computer executable components, such as can be stored in a desired storage medium (e.g., random access memory, a hard disk drive, CD ROM, and the like), or computer executable components running on a computer. The dynamic instrumentation system 10 includes a dynamic instrumentation tool 12. The dynamic instrumentation tool 12 interfaces with an executable program 14 to assign instrumentation (e.g., counters) to the executable program 14.

The dynamic instrumentation tool 12 is operative to assign instrumentation counters and insert instrumentation counter instructions in at least one loop associated with the executable program 14. The instrumentation counters include a register adder that counts iterations associated with a loop execution, and a loop counter that maintains a count associated with total loop iterations over one or more loop executions. The dynamic instrumentation tool 12 is operative to assign a free register to the at least one loop. A free register can be found by analyzing the executable program 14 to determine which registers are not used by the executable program. Additionally, the code can be analyzed to determine which registers are currently available for use that would not interfere with the program execution. It is to be appreciated that a variety of techniques can be employed to find a free register.
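One of the simplest such techniques, scanning the decoded code for registers that are never referenced, can be sketched as follows. The instruction format (an opcode plus a list of operand names) and the register names are hypothetical and chosen only for illustration. A real tool would also need to account for calling conventions and implicitly used registers.

```python
def find_free_registers(all_registers, instructions):
    """Return the registers that no instruction in the given code refers to.

    `instructions` is a list of (opcode, operands) pairs; this format is an
    assumption made for the sketch, not a real decoder's output.
    """
    registers = set(all_registers)
    used = set()
    for _opcode, operands in instructions:
        # Operands that are not register names (immediates, labels) are ignored.
        used.update(op for op in operands if op in registers)
    return sorted(registers - used)
```

For example, with registers `r0` through `r3` and code that touches only `r0` and `r1`, the function reports `r2` and `r3` as free.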

The dynamic instrumentation tool 12 can load the executable program 14 and insert breaks at a beginning of each function under the control of a debugging interface, which is provided by the operating system (e.g., ttrace() on HP-UX® Operating System, ptrace() on LINUX® Operating System, Extended Debugging Interface (eXDI) on MICROSOFT WINDOWS® Operating System). The executable program 14 then is executed. The debugging interface makes it possible to transfer control from the target application to the dynamic instrumentation tool 12 whenever a break is encountered in the executable program.

As the executable program 14 encounters a break corresponding to a newly reached function, control is passed to the dynamic instrumentation tool 12. The dynamic instrumentation tool 12 loads the function. The dynamic instrumentation tool 12 then converts the function into an intermediate representation by decoding the binary code associated with the function and converting the decoded binary code via an intermediate representation instrument. A control flow graph constructor then generates a control flow graph from the intermediate representation. A loop analysis is then performed on the intermediate representation by a loop recognition algorithm. The dynamic instrumentation tool 12 can then insert one or more instrumentation counters via a probe code instrumenter.

The loop counter updates can be minimized by inserting register adders in the innermost loops of the executable program 14. The innermost loops of the executable program are loops that contain no inner loops, while the outermost loops are not nested in any outer loop. Intermediate loops are loops that are both inner loops and outer loops, such that an intermediate loop is nested in one or more outer loops and also contains one or more inner loops nested therein. The execution speed of the instrumented code can be improved by generating free registers for innermost loops first, intermediate loops second, and outermost loops last, as long as free registers are available. Typically, loop counters are employed to count loop iterations by utilizing atomic memory update instructions. The atomic memory update instructions are multi-thread safe, but are substantially time intensive (e.g., about 20 clock cycles) as compared to a register add instruction (e.g., about 1 clock cycle).
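The innermost-first ordering can be sketched as a small allocation routine. The loop nest is described here by a hypothetical mapping from each loop name to the name of its enclosing loop (or `None`), and the register names are placeholders. In this sketch a loop that is neither nested nor contains other loops is treated as innermost, and loops left over once the registers run out simply receive none.

```python
def assign_registers(parents, free_registers):
    """Hand out free registers to innermost loops first, outermost loops last.

    `parents` maps each loop to its directly enclosing loop, or None.
    Returns a dict of loop name -> register for the loops that received one.
    """
    children = {name: [] for name in parents}
    for name, parent in parents.items():
        if parent is not None:
            children[parent].append(name)

    def priority(name):
        if not children[name]:
            return 0   # innermost: contains no inner loops
        if parents[name] is not None:
            return 1   # intermediate: nested and containing inner loops
        return 2       # outermost: not nested in any outer loop

    ordered = sorted(parents, key=priority)   # stable sort keeps input order
    # zip stops when the free registers are exhausted.
    return dict(zip(ordered, free_registers))
```

With three free registers and four loops, the two innermost loops and the intermediate loop are served, and the outermost loop is left without a register.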

In one embodiment of the present invention, a register adder initialization instruction is inserted prior to an entry point of the loop in a way such that paths reaching the loop entry point also reach the register adder initialization instruction. A register add instruction is inserted prior to or at a back edge of the loop, or between the entry point and the back edge. The register add instruction employs the free register to increment a loop count value for iterations of a loop during a loop execution. The register add instruction is substantially faster than an atomic memory update instruction. A loop counter update instruction is then inserted prior to an exit point of the loop and after the back edge of the loop. The loop counter update instruction maintains a count associated with total loop iterations over one or more loop executions. The loop counter value is retained in a corresponding memory location associated with a respective loop. The loop counter update instruction can be embedded in a multi-thread safe set of ownership instructions, such as a spinlock operation.

The dynamic instrumentation tool 12 then encodes the modified function code to provide an instrumented function in binary form. The instrumented function is stored in a shared memory 18. The original entry point of the function (where the break point was placed) is patched with a branch/jump to the instrumented version of the function. Execution is then resumed at the address of the instrumented function (e.g., resume can be an option in the debug interface). Control is thereby transferred back to the executable program, which continues to execute until a breakpoint at a function not yet encountered is reached. The process then repeats for the next function until all functions have been instrumented. Once the executable program 14 and instrumented functions have completed execution, the dynamic instrumentation tool 12 can retrieve the loop counter values from the shared memory 18.

FIG. 2 illustrates components associated with a dynamic instrumentation tool 40. The dynamic instrumentation tool 40 includes a decoder and an intermediate representation (IR) instrument 42 that reads in the binary function, and decodes the binary function into an intermediate representation. A control flow graph constructor 44 can configure the intermediate representation as a control flow graph with basic blocks and edges between those blocks representing possible flows of control. A loop analysis can be performed on the control flow graph by a loop recognition algorithm 46. The loop recognition algorithm 46 can be one of many different algorithms known for recognizing loops in a control flow graph.
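As an illustration of one well-known loop recognition approach, the sketch below finds back edges using dominators: an edge from block A to block B is a back edge when B dominates A. The text does not say which algorithm is used, so this choice, the dictionary-based graph format, and the block names are assumptions made for the example.

```python
def dominators(cfg, entry):
    """Compute the dominator set of every basic block by fixed-point iteration.

    `cfg` maps each block name to the list of its successor blocks, and every
    block must appear as a key.
    """
    nodes = set(cfg)
    preds = {n: [] for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].append(src)

    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def find_back_edges(cfg, entry):
    """Return the edges (src, dst) whose target dominates their source."""
    dom = dominators(cfg, entry)
    return sorted(
        (src, dst)
        for src, succs in cfg.items()
        for dst in succs
        if dst in dom[src]
    )
```

For a graph with blocks `entry`, `head`, `body` and `exit`, where `body` branches back to `head`, the only back edge found is `("body", "head")`, which identifies `head` as the loop entry point.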

The dynamic instrumentation tool 40 also includes a probe code instrumenter 48. The probe code instrumenter 48 can insert a register adder initialization instruction prior to an entry point of the loop in a way such that every path reaching the loop entry point also reaches the register adder initialization instruction, a register add instruction prior to or at a back edge of the loop, or between the entry point and the back edge, and a loop counter update instruction prior to an exit point of the loop and after the back edge of the loop. The probe code instrumenter 48 can generate free registers associated with the register add instructions for one or more innermost loops, as long as free registers are available. The dynamic instrumentation tool 40 includes an encoder 50 that encodes the IR instrumented function into a binary instrumented function. The dynamic instrumentation tool 40 includes a process control 52 that stores the binary instrumented function in shared memory, patches a branch/jump instruction in the executable program where the break point was placed, and passes control back to the executable program.

FIG. 3 illustrates a block diagram of contents of a portion of shared memory 60 associated with instrumenting loops of an executable program. The shared memory 60 retains loop counter values for loops, labeled 1 to N, in the executable program, where N is an integer greater than or equal to one. The loop counter values can correspond to the number of executed iterations of innermost loops, outermost loops and/or intermediary loops that have executed in the executable program. Additionally, the loop counter values can correspond to a single function, or a plurality of functions associated with the executable program. The loop counter values are updated each time a loop completes execution in the executable program and a loop exit point is encountered. The loop counter values are updated by adding the value of the register adder that corresponds to the number of loop iterations associated with a loop execution. Since the loop counter values reside in shared memory 60, the loop counter values are not, by themselves, multi-thread safe.

Therefore, the shared memory 60 includes counter access flags, labeled C1AF through CNAF, associated with each loop counter value. The counter access flags are employed to maintain ownership of the loop counter value memory spaces by a single process at a time, so that loop counter value integrity is maintained. For example, if a process desires to overwrite a corresponding loop counter value, the process will request control of the loop counter value by checking the corresponding counter access flag. If the counter access flag is not set, the process will set the flag and update the corresponding loop counter value. The process will then reset the flag and release control of the loop counter value, so that other processes may access the loop counter value in shared memory 60. In this manner, the loop counter values maintain loop counter value integrity by being multi-thread safe.
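The flag protocol can be sketched with threads in Python. Python has no test-and-set instruction, so a non-blocking `threading.Lock.acquire` stands in for "check the flag and set it if it is not set". The class and function names are illustrative assumptions rather than anything defined in this disclosure.

```python
import threading


class LoopCounter:
    """A loop counter value guarded by a counter access flag."""

    def __init__(self):
        self.value = 0
        self._access_flag = threading.Lock()   # stand-in for the access flag

    def add(self, register_value):
        # Spin until the flag is found clear and is set by this thread.
        while not self._access_flag.acquire(blocking=False):
            pass
        try:
            self.value += register_value       # loop counter update
        finally:
            self._access_flag.release()        # reset the flag


def run_threads(counter, thread_count, executions, iterations):
    """Run the same loop in several threads, updating one shared counter."""

    def worker():
        for _ in range(executions):
            rx = 0                             # per-thread register adder
            for _ in range(iterations):
                rx += 1
            counter.add(rx)                    # one guarded update per execution

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Each thread keeps its own register adder, so only the single update at loop exit needs the flag. Four threads that each run 50 loop executions of 10 iterations leave the counter at 2000.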

The shared memory 60 also retains a plurality of instrumented functions, labeled 1 through K, where K is an integer greater than or equal to one. The dynamic instrumentation tool stores the encoded instrumented functions in shared memory 60 to provide ready access to both the instrumentation tool and the executable program. A branch/jump instruction is employed as a patch at the start of a non-instrumented function, so whenever the original entry point of the non-instrumented function is reached, execution resumes/continues at the instrumented version of the function. Once the executable program is instrumented, a substantial portion of executable program execution occurs in shared memory 60 via the instrumented functions corresponding to the non-instrumented functions that have been reached.

FIG. 4 illustrates a loop 70 associated with an executable program having instrumentation counters inserted therein. The loop 70 can reside in a function in the executable program. The loop 70 can be an innermost loop, an outermost loop or an intermediary loop. The loop 70 includes instrumentation code provided by a dynamic instrumentation tool. The dynamic instrumentation tool assigns a free register to the loop 70 and inserts a register adder initialization instruction 72 (Rx=0) at line 001 prior to a loop entry point at line 002, such that paths reaching the loop entry point also reach the register adder initialization instruction 72. The dynamic instrumentation tool also inserts a register add instruction 74 (Rx=Rx+1) at line 003 between the loop entry point and a back edge of the loop 70 at lines 004 and 005. The register add instruction 74 causes the value of a free register to be incremented (e.g., by one) each loop iteration associated with a loop execution.

The dynamic instrumentation tool also inserts a loop counter update instruction 76 (Counter1=Counter1+Rx) at line 007 after the back edge of the loop and prior to an exit point of the loop 70 at line 009. Execution of the loop counter update instruction 76 causes a loop counter value in shared memory to be updated by adding the value of the register adder (Rx) to the loop counter value in shared memory.

In certain circumstances, the number of iterations is fixed, for example, when a programmer employs numerical integer constants to denote the loop start, end and increment values. This can be detected by the loop recognition algorithm, and an exact trip count can be derived. If the loop contains no other exits, the loop is known to execute “trip-count” times. In this situation, a register add instruction is not necessary and the loop counter update instruction simply increments the loop counter value by a fixed number of iterations (e.g., 10).
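A trip count derivation of this kind might look like the following sketch. It assumes a loop of the form `for (i = start; i < end; i += step)` with a positive step and no other exits. That loop shape and the function name are assumptions, since the text does not fix a particular form.

```python
def trip_count(start, end, step):
    """Iterations of `for (i = start; i < end; i += step)` with constant bounds."""
    if step <= 0:
        raise ValueError("this sketch only handles a positive increment")
    if start >= end:
        return 0
    # Ceiling division: a final partial stride still costs one iteration.
    return (end - start + step - 1) // step
```

With start 0, end 10 and increment 1 the result is 10, so the loop counter update instruction can add the constant 10 at loop exit and no register adder is needed.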

The loop counter update instruction 76 is embedded in memory ownership instructions, such that ownership of the loop counter value memory location is requested prior to updating of the loop counter value memory. For example, a spinlock command is a set of instructions that requests access of a loop counter value by checking the state of a loop access flag via a set of spinlock access instructions illustrated at line 006. The loop counter value (Counter1) is then updated by execution of the loop counter update instruction 76. The loop access flag is then reset via a set of spinlock release instructions illustrated at line 008, thus releasing ownership control of the memory location associated with the loop counter value. Although a single instruction is shown for illustrating a spinlock access instruction set, a loop counter update instruction and a spinlock release instruction set, a plurality of instructions can be employed to execute any of a spinlock access, a loop counter update and a spinlock reset.

The dynamic instrumentation tool can assign a free register and insert the register adder initialization instruction, the register add instruction and the loop counter update instruction in one or more loops. In one embodiment, the dynamic instrumentation tool assigns a free register and inserts the register adder initialization instruction, the register add instruction and the loop counter update instruction set for a plurality of innermost loops first, intermediate loops second, and outermost loops last, as long as free registers are available.

In view of the foregoing structural and functional features described above, certain methods will be better appreciated with reference to FIGS. 5-7. It is to be understood and appreciated that the illustrated actions, in other embodiments, may occur in different orders and/or concurrently with other actions. Moreover, not all illustrated features may be required to implement a method. It is to be further understood that the following methodologies can be implemented in hardware (e.g., a computer or a computer network as one or more integrated circuits or circuit boards containing one or more microprocessors), software (e.g., as executable instructions running on one or more processors of a computer system), or any combination thereof.

FIG. 5 illustrates a methodology for inserting instrumentation code into loops of an executable program. The methodology begins at 100 where an executable program is analyzed and breaks are inserted before each function. The executable program then begins execution, until a breakpoint is encountered for a given function. Once a breakpoint is encountered, the methodology proceeds to 120. At 120, a determination is made as to whether the executable program has completed execution. If the executable program has completed execution (YES), the methodology proceeds to 140 to retrieve the instrumentation values. If the executable program has not completed execution (NO), the methodology proceeds to 130.

At 130, the dynamic instrumentation tool decodes the executable function and generates an intermediate representation of the given function, and generates a control flow graph from the intermediate representation. The dynamic instrumentation tool then performs loop recognition analysis on the control flow graph to identify loops in the given function at 150. After the loops have been identified, the methodology proceeds to 160.

At 160, one or more instrumentation counters are inserted into one or more loops associated with the given function. A register adder initialization instruction is inserted prior to an entry point of a loop in a way such that every path reaching the loop entry point also reaches the register adder initialization instruction. A register add instruction is inserted prior to or at a back edge of the loop, or between the entry point and the back edge. The register add instruction employs a free register to increment a loop count value for iterations of a loop during a loop execution. A loop counter update instruction is then inserted prior to an exit point of the loop and after the back edge of the loop. The loop counter update instruction maintains a count associated with total loop iterations over one or more loop executions. The loop counter value is retained in a corresponding memory location associated with a respective loop. The loop counter update instruction can be embedded in a multi-thread safe set of ownership instructions, such as a spinlock operation.

At 170, the modified instrumented executable function is encoded into a binary executable, and stored in shared memory. At 180, the break in the executable program associated with the given function is replaced with a branch/jump to the instrumented function and control is returned to the executable program. The methodology then proceeds to 190 where execution is continued at the start of the instrumented function. The methodology then returns to 110 until the next breakpoint is encountered.
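The break, instrument, patch and resume cycle of this methodology can be mimicked with a toy model in which Python callables stand in for binary functions and a dictionary stands in for patched entry points. All names are hypothetical, and the actual insertion of loop counters is reduced to a placeholder wrapper, so the sketch shows only the control transfer, not the instrumentation itself.

```python
class InstrumentationTool:
    """Toy model of the break, instrument, patch, resume cycle."""

    def __init__(self, functions):
        self.originals = dict(functions)   # non-instrumented functions
        self.shared_memory = {}            # instrumented functions, by name
        self.breaks_hit = []               # order in which breaks were reached
        # Every function entry starts as a break that hands control to the tool.
        self.entry_points = {name: self._make_break(name) for name in functions}

    def _make_break(self, name):
        def on_break(*args):
            self.breaks_hit.append(name)
            instrumented = self._instrument(name)
            self.shared_memory[name] = instrumented
            # Patch the original entry point with a "jump" to the new version.
            self.entry_points[name] = instrumented
            # Resume execution at the start of the instrumented function.
            return instrumented(*args)
        return on_break

    def _instrument(self, name):
        original = self.originals[name]

        def instrumented(*args):
            # Placeholder: a real tool would have inserted loop counters here.
            return original(*args)

        return instrumented

    def call(self, name, *args):
        return self.entry_points[name](*args)
```

Calling the same function twice reaches the break only once, because the second call goes straight through the patched entry point. Functions that are never called are never instrumented.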

FIG. 6 illustrates an alternate methodology for inserting instrumentation code in an executable program. At 200, a register adder initialization instruction is inserted prior to a loop entry point in the executable program in a way such that every path reaching the loop entry point also reaches the register adder initialization instruction. At 210, a register add instruction is inserted between the loop entry point and a back edge of the loop. At 220, a loop counter update instruction is inserted after the back edge of the loop.

FIG. 7 illustrates yet another alternate methodology for inserting instrumentation code in an executable program. At 250, loop analysis is performed on an executable program to identify at least one loop. At 260, a register add instruction is assigned to a back edge of the at least one loop. At 270, a loop counter update instruction is assigned to an exit point associated with the at least one loop.

FIG. 8 illustrates a computer system 320 that can be employed to execute one or more embodiments employing computer executable instructions. The computer system 320 can be implemented on one or more general purpose networked computer systems, embedded computer systems, routers, switches, server devices, client devices, various intermediate devices/nodes and/or stand alone computer systems.

The computer system 320 includes a processing unit 321, a system memory 322, and a system bus 323 that couples various system components including the system memory to the processing unit 321. Dual microprocessors and other multi-processor architectures also can be used as the processing unit 321. The system bus may be any of several types of bus structure including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory includes read only memory (ROM) 324 and random access memory (RAM) 325. A basic input/output system (BIOS) can reside in memory containing the basic routines that help to transfer information between elements within the computer system 320.

The computer system 320 can include a hard disk drive 327, a magnetic disk drive 328, e.g., to read from or write to a removable disk 329, and an optical disk drive 330, e.g., for reading a CD-ROM disk 331 or to read from or write to other optical media. The hard disk drive 327, magnetic disk drive 328, and optical disk drive 330 are connected to the system bus 323 by a hard disk drive interface 332, a magnetic disk drive interface 333, and an optical drive interface 334, respectively. The drives and their associated computer-readable media provide nonvolatile storage of data, data structures, and computer-executable instructions for the computer system 320. Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, other types of media which are readable by a computer, such as magnetic cassettes, flash memory cards, digital video disks and the like, may also be used in the operating environment, and any such media may contain computer-executable instructions.

A number of program modules may be stored in the drives and RAM 325, including an operating system 335, one or more executable programs 336, other program modules 337, and program data 338. A user may enter commands and information into the computer system 320 through a keyboard 340 and a pointing device, such as a mouse 342. Other input devices (not shown) may include a microphone, a joystick, a game pad, a scanner, or the like. These and other input devices are often connected to the processing unit 321 through a corresponding port interface 346 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, a serial port or a universal serial bus (USB). A monitor 347 or other type of display device is also connected to the system bus 323 via an interface, such as a video adapter 348.

The computer system 320 may operate in a networked environment using logical connections to one or more remote computers, such as a remote client computer 349. The remote computer 349 may be a workstation, a computer system, a router, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer system 320. The logical connections can include a local area network (LAN) 351 and a wide area network (WAN) 352.

When used in a LAN networking environment, the computer system 320 can be connected to the local network 351 through a network interface or adapter 353. When used in a WAN networking environment, the computer system 320 can include a modem 354, or can be connected to a communications server on the LAN. The modem 354, which may be internal or external, is connected to the system bus 323 via the port interface 346. In a networked environment, program modules depicted relative to the computer system 320, or portions thereof, may be stored in the remote memory storage device 350.

What have been described above are examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art will recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.

1. A system for instrumenting a loop of an executable program, thesystem comprising: a dynamic instrumentation tool that inserts aregister add instruction associated with a back edge of the loop in anexecutable program and a loop counter update instruction associated withan exit point of the loop, the register add instruction increments aregister value with executed iterations of the loop for a given loopexecution, and the loop counter update instruction updates a loopcounter value based on the register value at completion of the givenloop execution; and a shared memory that retains the loop counter valueassociated with a total number of loop iterations of the loop.
 2. Thesystem of claim 1, wherein the loop counter update instruction isembedded in a set of loop counter value ownership instructions thatfacilitate multi-threaded safe loop counter value integrity.
 3. Thesystem of claim 2, wherein the shared memory retains a loop counteraccess flag associated with the loop, the set of loop counter valueownership instructions comprising at least a first instruction forrequesting access to the loop counter value and setting the loop counteraccess flag prior to updating the loop counter value, and at least asecond instruction for resetting the loop counter access flag afterupdating the loop counter value wherein access of the loop counter valueis controlled based on the state of the loop counter access flag.
4. The system of claim 1, wherein the register value is retained in a free register of the system.
5. The system of claim 1, wherein the dynamic instrumentation tool dynamically assigns a respective free register, and inserts a register add instruction associated with a back edge and a loop counter update instruction associated with an exit point of an innermost loop, for each of a plurality of innermost loops.
6. The system of claim 1, wherein the loop is at least one of an innermost loop, an intermediary loop and an outermost loop of the executable program.
7. The system of claim 1, wherein the dynamic instrumentation tool decodes a given function of the executable program into an intermediate representation, constructs a control flow graph and performs a loop recognition to identify loops in the given function.
8. The system of claim 7, wherein the dynamic instrumentation tool encodes the given function with the inserted register add instruction and the loop counter update instruction to provide an instrumented function.
9. The system of claim 8, wherein the dynamic instrumentation tool stores the instrumented function in shared memory and inserts a branch/jump to the instrumented function at the given function in the executable program.
10. The system of claim 1, wherein the dynamic instrumentation tool inserts a register adder initialization instruction prior to a loop entry point such that paths reaching the loop entry point also reach the register adder initialization instruction.
11. A method of inserting instrumentation code into a loop of an executable program, the method comprising: inserting a register adder initialization instruction prior to a loop entry point of the loop such that paths reaching the loop entry point also reach the register adder initialization instruction; inserting a register add instruction between the loop entry point and a back edge of the loop; and inserting a loop counter update instruction after the back edge of the loop.
12. The method of claim 11, further comprising finding a free register to assign to the loop, the free register being initialized by the register adder initialization instruction and incremented by the register add instruction for each loop iteration associated with a loop execution of the loop.
13. The method of claim 11, further comprising repeating the inserting a register adder initialization instruction, inserting a register add instruction and inserting a loop counter update instruction for a plurality of innermost loops in a function of the executable program.
14. The method of claim 11, further comprising inserting the loop counter update instruction between a loop counter value ownership request instruction and a loop counter value ownership release instruction, wherein a loop counter value access flag is set when ownership of the loop counter value is provided and reset when ownership of the loop counter value is released.
15. The method of claim 11, wherein the inserting a register adder initialization instruction, inserting a register add instruction and inserting a loop counter update instruction for the loop are performed dynamically for a given function of the executable program as functions are executed.
16. A computer readable medium having computer executable instructions for performing a method comprising: performing loop analysis on an executable program to identify at least one loop; assigning a register add instruction to a back edge of the at least one loop; and assigning a loop counter update instruction to an exit point associated with the at least one loop.
17. The computer readable medium having computer executable instructions for performing the method of claim 16, wherein the performing a loop analysis on an executable program comprises: representing a function of the executable program as an intermediate representation; constructing a control flow graph from the intermediate representation; and performing a loop recognition algorithm on the control flow graph to identify at least one loop in the function.
18. The computer readable medium having computer executable instructions for performing the method of claim 16, wherein the assigning a register add instruction comprises inserting a register add instruction between a loop entry point of the at least one loop and a back edge of the at least one loop, and assigning a loop counter update instruction comprises inserting a loop counter update instruction after the back edge of the at least one loop and prior to an exit point of the at least one loop.
19. The computer readable medium having computer executable instructions for performing the method of claim 18, further comprising inserting a register adder initialization instruction prior to the loop entry point of the at least one loop, such that paths reaching the loop entry point also reach the register adder initialization instruction.
20. The computer readable medium having computer executable instructions for performing the method of claim 19, further comprising encoding the inserted instructions along with the at least one loop for an associated function to generate an instrumented function, and storing the instrumented function in memory.
21. The computer readable medium having computer executable instructions for performing the method of claim 16, wherein the performing a loop analysis is performed dynamically on each function in the executable program as the executable program executes, such that the assigning a register add instruction to a back edge of the at least one loop, and assigning a loop counter update instruction to an exit point associated with the at least one loop, are repeated for each function that includes at least one loop.
22. A dynamic instrumentation system comprising: means for generating an intermediate representation of a function associated with an executable program; means for analyzing the intermediate representation to identify at least one loop in the function; means for inserting code into the identified at least one loop, the means for inserting code inserting a register add instruction between a loop entry point and a back edge of the identified at least one loop, and a loop counter update instruction after the back edge of the identified at least one loop; and means for encoding the inserted code and the intermediate representation of the function to produce an instrumented function.
23. The system of claim 22, wherein the means for inserting code into the identified at least one loop comprises inserting a register adder initialization instruction prior to a loop entry point of the identified at least one loop, such that paths reaching the loop entry point also reach the register adder initialization instruction.
24. The system of claim 22, further comprising means for storing a loop counter value associated with execution of the loop counter update instruction.
25. The system of claim 22, wherein the means for inserting code into the identified at least one loop comprises embedding the loop counter update instruction between loop counter value ownership instructions that facilitate multi-threaded safe loop counter value integrity.
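
By way of illustration only, and not as part of the claims, the three insertions recited in claim 11 and the ownership-guarded update of claim 14 might be sketched as follows. The toy instruction tuples, the names `instrument_loop`, `run`, `LoopCounter`, `r_free` and `loop0`, and the use of a Python `threading.Lock` as a stand-in for the loop counter access flag are all assumptions made for this sketch; they are not taken from the described embodiments, which operate on machine instructions.

```python
import threading


class LoopCounter:
    """Hypothetical stand-in for a loop counter value held in shared memory."""

    def __init__(self):
        self.value = 0
        # Assumption: a lock models the loop counter access flag.
        self._access_flag = threading.Lock()

    def update(self, register_value):
        with self._access_flag:           # ownership requested, flag set
            self.value += register_value  # loop counter update
        # leaving the block releases ownership, flag reset


def instrument_loop(program, entry_label, free_reg, counter_name):
    """Return a copy of `program` with the three probe instructions inserted."""
    entry = program.index(("label", entry_label))
    # The back edge is the branch after the entry that targets the entry label.
    back = next(i for i, ins in enumerate(program)
                if i > entry and ins[0] == "blt" and ins[3] == entry_label)
    return (program[:entry]
            + [("set", free_reg, 0)]                 # register adder initialization
            + program[entry:back]
            + [("add", free_reg, 1)]                 # register add, before the back edge
            + [program[back]]
            + [("update", counter_name, free_reg)]   # loop counter update, after the back edge
            + program[back + 1:])


def run(program, counters):
    """Tiny interpreter for the toy instruction set used in this sketch."""
    regs = {}
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        if ins[0] == "set":
            regs[ins[1]] = ins[2]
        elif ins[0] == "add":
            regs[ins[1]] += ins[2]
        elif ins[0] == "blt" and regs[ins[1]] < ins[2]:
            pc = labels[ins[3]]            # take the back edge
            continue
        elif ins[0] == "update":
            counters[ins[1]].update(regs[ins[2]])
        pc += 1                            # labels and untaken branches fall through
    return regs


# A loop whose body executes five times per loop execution.
PROGRAM = [
    ("set", "i", 0),
    ("label", "loop"),
    ("add", "i", 1),
    ("blt", "i", 5, "loop"),
]
```

In this sketch the per-iteration cost is a single register add, and the shared counter is touched only once per loop execution, after the back edge falls through. Running the instrumented program more than once accumulates the total number of iterations in the shared counter.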